Automatic Acquisition of Machine Translation Resources in the Abu-MaTran Project
Abstract
This paper provides an overview of the research and development activities carried out within the Abu-MaTran project to alleviate the language resource bottleneck in machine translation. We have developed a range of tools for acquiring the main resources required by the two most popular approaches to machine translation: statistical models (corpora) and rule-based models (dictionaries and rules). All of these tools have been released under open-source licenses and were developed with industrial exploitation in mind.
Similar resources
Abu-MaTran at WMT 2015 Translation Task: Morphological Segmentation and Web Crawling
This paper presents the machine translation systems submitted by the Abu-MaTran project for the Finnish–English language pair at the WMT 2015 translation task. We tackle the lack of resources and complex morphology of the Finnish language by (i) crawling parallel and monolingual data from the Web and (ii) applying rule-based and unsupervised methods for morphological segmentation. Several stati...
Abu-MaTran: Automatic building of Machine Translation
We present the current status of Abu-MaTran (http://www.abumatran.eu), a 4-year project (January 2013–December 2016) on rapid development of machine translation for under-resourced languages. It is funded under Marie Curie's Industry-Academia Partnerships and Pathways 2012 programme. This is a consortium-based project with 5 partners (4 academic and 1 industrial).
Abu-MaTran at WMT 2016 Translation Task: Deep Learning, Morphological Segmentation and Tuning on Character Sequences
This paper presents the systems submitted by the Abu-MaTran project to the English-to-Finnish language pair at the WMT 2016 news translation task. We applied morphological segmentation and deep learning in order to address (i) the data scarcity problem caused by the lack of in-domain parallel data in the constrained task and (ii) the complex morphology of Finnish. We submitted a neural machine t...
Collaborative Development of a Rule-Based Machine Translator between Croatian and Serbian
This paper describes the development and current state of a bidirectional Croatian-Serbian machine translation system based on the open-source Apertium platform. It has been created inside the Abu-MaTran project with the aims of creating free linguistic resources as well as having non-experts and experts work together. We describe the collaborative way of collecting the necessary data to build o...
The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language
Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines as they are developed based on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages is still under question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...
Journal title:
- Procesamiento del Lenguaje Natural
Volume 55
Publication date: 2015